A large-scale genome-wide cross-trait analysis reveals shared genetic architecture between Alzheimer’s disease and gastrointestinal tract disorders

Consistent with the concept of the gut-brain phenomenon, observational studies suggest a relationship between Alzheimer’s disease (AD) and gastrointestinal tract (GIT) disorders; however, their underlying mechanisms remain unclear. Here, we analyse several genome-wide association study (GWAS) summary statistics (N = 34,652–456,327) to assess the relationship of AD with GIT disorders. Findings reveal a significant positive genetic overlap and correlation between AD and gastroesophageal reflux disease (GERD), peptic ulcer disease (PUD), gastritis-duodenitis, irritable bowel syndrome and diverticulosis, but not inflammatory bowel disease. Cross-trait meta-analysis identifies several loci (P_meta-analysis < 5 × 10^-8) shared by AD and GIT disorders (GERD and PUD) including PDE4B, BRINP3, ATG16L1, SEMA3F, HLA-DRA, SCARA3, MTSS2, PHB, and TOMM40. Colocalization and gene-based analyses reinforce these loci. Pathway-based analyses demonstrate significant enrichment of lipid metabolism, autoimmunity, lipase inhibitors, PD-1 signalling, and statin mechanisms, among others, for AD and GIT traits. Our findings provide genetic insights into the gut-brain relationship, implicating shared but non-causal genetic susceptibility of GIT disorders with the risk of AD. Genes and biological pathways identified are potential targets for further investigation in AD, GIT disorders, and their comorbidity.

Alzheimer's disease (AD) is the most prevalent form of dementia, characterised by neurodegeneration and a progressive decline in cognitive ability 1,2. The disorder is of increasing global public health importance, with wide-ranging adverse social and economic impacts on sufferers, their families, and society at large 1. By the year 2030, over 82 million people (and about 152 million by 2050) are projected to suffer from AD 1,2. While AD has no known curative treatments, and its pathogenesis is yet to be clearly understood, a comprehensive assessment of its shared genetics with other diseases (comorbidities) can provide a deeper understanding of its underlying biological mechanisms and enhance potential therapy development efforts.
Several studies have reported a pattern of co-occurrence of dementia (and AD in particular) with certain gastrointestinal tract (GIT) disorders, microbiota, dysbiosis or medications commonly used in the treatment of peptic ulcer disease (PUD) [3][4][5][6][7][8][9][10] . For example, an observational study reported more than twice the odds of dementia in individuals with gastritis (adjusted odds ratio [AOR]: 2.42, P < 0.001, 95% confidence interval [CI]: 1.68-3.49) 3 . Another observational study found a significant association between regular use of proton-pump inhibitors (PPI, medications for gastritis duodenitis, gastroesophageal reflux disease [GERD] or PUD) and increased risk of incident dementia (hazard ratio [HR]: 1.44 [95% CI, 1.36-1.52]; P < 0.001) 4 . Similarly, lansoprazole (a PPI) was reported to promote amyloid-beta (Aβ) production 5 , the accumulation of which is central to one of the core hypotheses for the development of AD 11 . More recently, a longitudinal study reported more than a sixfold increased risk of AD in individuals with inflammatory bowel disease (IBD) [HR: 6.19, 95%CI: 3.31-11.57], predicting over five-fold increased incidence across all forms of dementia 7 .
The available evidence, thus, suggests comorbidity or some forms of association between AD and GIT disorders, although it is not clear whether GIT traits are risks for AD or vice versa. Regardless, these findings agree with the concept of the 'gut-brain' axis or the 'gastric mucosa-brain' relationship, which has been implicated between GIT-related traits and central nervous system (CNS) disorders including depression and Parkinson's disease [12][13][14][15][16][17] . A relationship between AD and GIT disorders or their comorbidity can worsen the quality of life of sufferers while contributing to increased healthcare costs.
Despite the increasing number of studies reporting an association between AD and GIT traits, the biological mechanism(s) underlying this potential association remains unclear. Moreover, contrasting evidence exists 7,18,19 , leading to a longstanding debate on the potential links of GIT traits to the risk of AD 15,[18][19][20] . Large-scale genome-wide association studies (GWAS), identifying an increasing number of single nucleotide polymorphism (SNPs), genes, and susceptibility loci, have been conducted separately for AD and a range of GIT traits [21][22][23][24] . Findings from these GWAS provide compelling evidence for the roles of genetics in the aetiologies of AD and GIT disorders including GERD, PUD, PGM (a combination of disease-diagnosis of PUD and/or GERD and/or corresponding medications and treatments-a potential proxy for PUD or GERD), gastritis-duodenitis, irritable bowel syndrome (IBS), diverticular disease, and IBD [21][22][23][24] . However, to the best of our knowledge, no study has leveraged the possible pleiotropy between AD and GIT disorders as a basis for discovering their shared SNPs, genes and/or susceptibility loci.
In this study, we analyse well-powered GWAS summary data to comprehensively assess the genetic relationship and potential causal association between AD and GIT disorders. We demonstrate a significant positive genetic overlap and correlation between AD and GERD, PUD, PGM, IBS, gastritis-duodenitis, and diverticular disease. Also, in a cross-trait GWAS meta-analysis, we identify many loci shared by AD and GIT disorders. Causality assessment reveals no evidence for a significant causal association between AD and GIT disorders. However, we identify shared genes reaching genome-wide significance for AD and GIT disorders in gene-based association analyses. Lastly, pathway-based analyses show significant enrichment of lipid metabolism, autoimmunity, lipase inhibitors, PD-1 signalling and statin mechanisms, among others, for AD and GIT traits.
Figure 1 presents a schematic workflow for this study. Briefly, we performed three broad levels of analyses: SNP-level, gene-level, and pathway-based. First, we used linkage disequilibrium score regression (LDSC) 25 to estimate the genetic correlation between AD and GIT traits, and the 'SNP effect concordance analysis' (SECA) 26 method to assess concordance in SNP risk effects. Second, to identify SNPs and susceptibility loci shared by AD and GIT disorders, we carried out GWAS meta-analyses. We also applied the pairwise GWAS (colocalisation) method 27 to identify independent genomic loci with shared genetic influence on AD and GIT disorders. Third, using the Mendelian randomisation (MR) 28 and the Latent Causal Variable (LCV) 29 methods, we assessed potential (and partial) causal associations between AD and GIT disorders. Lastly, we performed gene and pathway-based analyses to identify shared genes reaching genome-wide significance and biological pathways for AD and GIT disorders.
The largest publicly available AD summary statistics and GIT summary data from research consortia or public repositories were utilised for analysis (Table 1 and Supplementary Data 1).

Results
Genetic correlation between AD and GIT disorders. We assessed and quantified the SNP-level genetic correlation between AD and GIT disorders using the LDSC 25 analysis method. The apolipoprotein E (APOE) region has a large effect on the risk of AD; hence, we excluded APOE and the 500 kilobase (kb) flanking region (hg19, 19:44,909,039-45,912,650) from the AD GWAS. We also excluded SNPs in the 26 to 36 megabase region of chromosome six from the data given the complex LD structure in the human major histocompatibility complex (MHC). Notably, in analyses both with and without the APOE region, LDSC reveals a significant genetic correlation between AD and GIT traits ( Table 2). Genetic covariance intercept estimates were not significantly different from zero (Supplementary Data 2), indicating no sample overlap between our AD and GIT GWAS.
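The region exclusions described above can be illustrated with a minimal sketch. It assumes summary statistics are held as simple records with hypothetical field names ('snp', 'chr', 'pos'); the coordinates are those given in the text (hg19), and the toy SNP identifiers and positions are made up for illustration. It is not the pipeline used in the study.

```python
# Regions excluded before LDSC, taken from the text (hg19 coordinates).
APOE_REGION = ("19", 44_909_039, 45_912_650)  # APOE plus flanking region
MHC_REGION = ("6", 26_000_000, 36_000_000)    # chromosome 6, 26-36 Mb


def in_region(chrom, pos, region):
    """Return True if (chrom, pos) lies inside the inclusive region."""
    r_chrom, start, end = region
    return str(chrom) == r_chrom and start <= pos <= end


def exclude_regions(snps, regions=(APOE_REGION, MHC_REGION)):
    """Keep only SNPs that fall outside every excluded region.

    `snps` is an iterable of dicts with hypothetical keys 'snp', 'chr', 'pos'.
    """
    return [
        s for s in snps
        if not any(in_region(s["chr"], s["pos"], r) for r in regions)
    ]


# Toy records (made-up identifiers and positions, for illustration only).
toy = [
    {"snp": "snp_a", "chr": 19, "pos": 45_400_000},  # inside the APOE window
    {"snp": "snp_b", "chr": 6, "pos": 30_000_000},   # inside the MHC window
    {"snp": "snp_c", "chr": 1, "pos": 66_000_000},   # retained
]
kept = exclude_regions(toy)
print([s["snp"] for s in kept])
```

Running the analysis once on the filtered records and once on the full set mirrors the "with and without the APOE region" comparison reported in this section.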
Briefly, SECA performs a bi-directional analysis, assessing concordance in the direction of the effect of AD-associated SNPs (data set 1) on each of the GIT disorders (data set 2) and vice versa. First, we conducted two rounds of P-value informed LD clumping (first clumping: --clump-r2 0.1, --clump-kb 1000; second clumping: --clump-r2 0.1, --clump-kb 10000) using PLINK 1.90 30.
SECA subsequently assesses (using Fisher's test) the presence of excess SNPs in which the direction of effects is concordant across 144 subsets of data set 1 (AD GWAS) and data set 2 (each of the GIT traits GWAS).
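To make the concordance test concrete, the sketch below builds a 2 × 2 table of effect directions for one SNP subset and applies a one-sided Fisher's exact test for an excess of concordant effects (odds ratio > 1). The function names and toy effect sizes are hypothetical, and the code illustrates the general test rather than reproducing the SECA implementation.

```python
from math import comb


def concordance_table(effects_1, effects_2):
    """Cross-tabulate effect directions of the same SNPs in two data sets.

    Returns (a, b, c, d): a = positive in both, b = positive in data set 1
    only, c = positive in data set 2 only, d = negative in both.
    SNPs with an effect of exactly zero in either data set are skipped.
    """
    a = b = c = d = 0
    for e1, e2 in zip(effects_1, effects_2):
        if e1 > 0 and e2 > 0:
            a += 1
        elif e1 > 0 and e2 < 0:
            b += 1
        elif e1 < 0 and e2 > 0:
            c += 1
        elif e1 < 0 and e2 < 0:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Sample odds ratio; infinite when there are no discordant pairs."""
    return (a * d) / (b * c) if b * c else float("inf")


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact P value for excess concordance, P(X >= a),
    computed from the hypergeometric distribution with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, col1)


# Toy log-odds effects for eight independent SNPs (made-up values).
effects_ad = [0.10, 0.20, 0.30, 0.10, -0.10, -0.20, -0.30, -0.10]
effects_git = [0.05, 0.02, 0.01, -0.03, 0.04, -0.02, -0.01, -0.05]

table = concordance_table(effects_ad, effects_git)
print(table, odds_ratio(*table), fisher_one_sided(*table))
```

In the analysis described above, a test of this kind would be repeated for each of the 144 SNP subsets, with the number of nominally significant subsets then evaluated against permutations.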
We found a positive and significant concordance of SNP risk effect across the AD (data set 1) and each of the GIT GWAS (data set 2) including IBD (Table 3). For example, of the total 144 SNP subsets tested with AD as data set 1 (Table 3), all 144 (for GERD, PGM and gastritis-duodenitis), 139 (PUD), 133 (IBS), 130 (diverticulosis) and 42 (IBD) produced Fisher's exact tests with at least nominally significant effect concordance (odds ratio [OR] > 1 and P < 0.05). The empirical P values (P_permuted) for the significant associations, adjusting for the 144 SNP subsets tested (using permutations of 1000 replicates), range from 0.001 to 0.018 (Table 3). These counts are significantly more than expected by chance, supporting evidence of genetic overlap between AD and the GIT traits. By changing the direction of the analysis (in a bidirectional assessment), we tested each of the GIT traits as data set 1 against AD as data set 2 (Table 3). The results indicate evidence of a strong genetic overlap between AD and GERD, PUD, PGM, gastritis-duodenitis, IBS and diverticulosis. The results also suggest (except for IBD) that SNPs that are strongly associated with AD influence the named GIT traits and vice versa. Overall, findings in SECA are largely consistent with those of LDSC, except in the case of IBD, highlighting how SECA differs from LDSC (in its capacity for a bidirectional assessment) as well as complements it. Notably, and like LDSC, SECA found a significant association between AD and GIT traits with or without the APOE region (Table 3 and Supplementary Data 4). Further, replication analyses in SECA produced findings largely consistent with those of LDSC (Supplementary Data 5 and 6).
Table 1 notes. The 'clinically diagnosed AD' combined data from three case-control cohorts (N = 79,145). 'AD-by proxy' data were based on the UKB phenotype definition of individuals whose biological parents were affected by AD. The parent's current age, and where relevant, age at death were reported along with this GWAS data. The genetic correlation between the 'clinically diagnosed AD' and the 'AD-by proxy' is high at 0.81 21, providing strong evidence or justification for combining them as more comprehensively described in the associated publication 21. AD Alzheimer's disease, GERD and GORD gastroesophageal reflux disease, PUD peptic ulcer disease, PGM GWAS combining disease-diagnosis of PUD and/or GERD and/or medications for their treatments, IBS irritable bowel syndrome, IBD inflammatory bowel disease, ICD International Classification of Diseases, UKB United Kingdom Biobank. a UKB data code for case definition was from death register, primary care, hospital admissions data, self-report only, and other sources as described in the original publication Wu et al. 22. The replication set data were used for reproducibility testing in LDSC and SECA analyses, and partly in LCV analysis.
Table 2 notes. We applied Bonferroni adjustment for testing the effects of seven GIT traits on AD (0.05/7 = 7.1 × 10^-3), and all genetic correlation results surviving this cut-off were considered significant while those having P < 0.05 were regarded nominally significant. AD Alzheimer's disease, GIT gastrointestinal tract, GERD gastroesophageal reflux disease, PUD peptic ulcer disease, IBS irritable bowel syndrome, PGM GWAS combining disease-diagnosis of PUD and/or GERD and/or medications for their treatments, IBD inflammatory bowel disease, r_g genetic correlation, se standard error, P P value, MHC major histocompatibility complex.
Table 3 notes. AD Alzheimer's disease, GIT gastrointestinal tract, GERD gastroesophageal reflux disease, PUD peptic ulcer disease, IBS irritable bowel syndrome, PGM GWAS combining disease-diagnosis of PUD and/or GERD and/or medications for their treatments, IBD inflammatory bowel disease, SNP single-nucleotide polymorphism, P P value, MHC major histocompatibility complex. a The number of SNP subsets with nominally significant concordant effects is significantly more than expected by chance, indicating significant concordance of genetic risk between the pairs of traits.
SNPs and loci shared by AD and GIT disorders. Leveraging the significant genetic overlap and correlation as well as the substantial GWAS sample sizes, we performed cross-disorder meta-analyses of AD with GERD and PUD. The GWAS for PGM has many cases and an overall large sample size (Table 1) and is strongly correlated with GERD (r_g = 0.99, P = 0.000) and PUD (r_g = 0.76, P = 4.41 × 10^-101) [Supplementary Data 7]; hence, we also utilised it in a meta-analysis with AD. We aimed at identifying SNPs and loci which were not genome-wide significant in the individual AD or GIT disorder GWAS (i.e., 5 × 10^-8 < P_GWAS-data < 0.05) but reached the status (P_meta-analysis < 5 × 10^-8) following a meta-analysis.
We additionally identified SNPs and loci which were already established (P_GWAS-data < 5 × 10^-8) in AD (sentinel AD SNPs/loci), but which, following GWAS meta-analyses, were similarly associated with a GIT disorder, and vice versa. Briefly, our GWAS meta-analyses identified shared SNPs and susceptibility loci, some of which are putatively novel for AD or GIT disorders. First, a meta-analysis of AD and GERD identified a total of 119 SNPs reaching genome-wide significant association (P_meta-analysis < 5 × 10^-8, Supplementary Data 8), from which we characterised seven independent (r^2 < 0.1) genomic loci: 1p31.3, 1q31.1, 3p21.31, 6p21.32, 17q21.32, 17q21.33 and 19q13.32 (Table 4). Many SNPs reaching genome-wide significance in these loci were not genome-wide significant in the individual AD and GIT GWAS we analysed but reached the status in the cross-trait meta-analyses (Table 3). Given this premise (that is, P_GWAS-data > 5 × 10^-8 > P_meta-analysis), the observation that some of the identified loci are known for AD or GIT traits (from other studies) provides support for our cross-trait analysis findings. Specifically, two of the identified loci (1p31.3 […]) are putatively novel for GERD given we have no evidence they were previously genome-wide significant for the disorder. A locus at 1q31.1 (near BRINP3) was putatively novel for both AD and GERD at the time of our analysis but has now been reported in a recent GERD multi-trait analysis 31, providing support for our finding. The remaining locus, 6p21.32 (near genes HLA-DQA2 and HLA-DRA), is known for both AD 32 and GIT disorders (IBD 33, ulcerative colitis 34 and Crohn's disease 33) and now (in our study), GERD.
Third, given its large sample size and strong genetic correlation with GERD and PUD, we performed a meta-analysis of PGM with AD, thereby identifying 42 SNPs (Supplementary Data 14) at seven independent loci (Table 4) reaching a genome-wide significance level. This analysis replicated, at a genome-wide level (P_meta-analysis < 5 × 10^-8), five of the seven genome-wide loci found in the AD and GERD meta-analysis: 1p31.3, 3p21.31, 6p21.32, 17q21.33 and 19q13.32. Additional loci found in the AD and PGM meta-analysis, such as 16q22.1 and 1q32.2, were at least genome-wide suggestive (P_meta-analysis < 1 × 10^-5) in the AD and GERD analysis, supporting their involvement in the disorders. An additional 23 SNPs, at three loci, were genome-wide suggestive (P_meta-analysis < 1 × 10^-5) in the AD and PGM meta-analysis (Supplementary Data 15). Of these, the rs33998678 SNP (16q22.1, IL34) is in strong LD (r^2 = 0.91) with a genome-wide significant locus found in the AD vs PGM analysis (rs34644948, at 16q22.1, MTSS2, Table 4), providing more support for its involvement in AD and GIT traits (GERD and PUD). Similarly, the rs663576 SNP (at 17q21.32, PHOSPHO1) is moderately correlated (r^2 = 0.41) with a genome-wide significant SNP (rs2584662 at 17q21.33, PHB, Table 4) identified in the meta-analysis. This locus (17q21.33) was found in the AD and GERD meta-analysis (SNP rs2584662 near PHB), supporting its involvement in AD and the GIT traits. Supplementary Data 10 summarises the sentinel AD loci associated with PGM and vice versa.
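The selection rule used in this section (variants that are only nominally significant in each input GWAS but reach genome-wide significance after meta-analysis) can be sketched as follows. The sketch assumes a standard fixed-effect, inverse-variance-weighted meta-analysis of per-SNP effect sizes; this section does not state which meta-analysis model was used, so the estimator, the function names and the toy numbers are illustrative assumptions only.

```python
from math import erfc, sqrt

GWS = 5e-8      # genome-wide significance threshold used in the text
NOMINAL = 0.05  # nominal significance threshold used in the text


def two_sided_p(beta, se):
    """Two-sided P value for a normally distributed z = beta / se."""
    return erfc(abs(beta / se) / sqrt(2))


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis (assumed model).

    `estimates` is a list of (beta, se) pairs for one SNP, one per GWAS,
    all aligned to the same effect allele.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    return beta, se, two_sided_p(beta, se)


def newly_significant(estimates):
    """True if every input P lies in (5e-8, 0.05) and the meta P is < 5e-8."""
    p_inputs = [two_sided_p(b, se) for b, se in estimates]
    p_meta = ivw_meta(estimates)[2]
    return all(GWS < p < NOMINAL for p in p_inputs) and p_meta < GWS


# Toy SNP (made-up numbers): sub-threshold in both GWAS, concordant effects.
toy_snp = [(0.05, 0.01), (0.04, 0.01)]
beta_meta, se_meta, p_meta = ivw_meta(toy_snp)
print(beta_meta, se_meta, p_meta, newly_significant(toy_snp))
```

A plain fixed-effect model of this kind ignores sample overlap and effect heterogeneity between traits, which dedicated cross-trait methods are designed to handle.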
Association of identified loci with other traits. Seven loci reached genome-wide significance in the meta-analysis of AD and GERD GWAS; most of these loci were replicated in the AD vs PUD and/or AD vs PGM meta-analysis. We queried each of the associated loci for pleiotropic associations with other traits using the GWAS catalogue (https://www.ebi.ac.uk/gwas) and the Open Targets Genetics (https://genetics.opentargets.org) platforms. For three of the loci, 1p31.3 (near PDE4B), 3p21.31 (near SEMA3F), and 1q31.1 (near BRINP3), we have no evidence of their previous association with AD at a genome-wide level (P < 5 × 10^-8). However, and potentially supportive of our findings, the loci have been reported for AD-related phenotypes such as cognitive traits.
For example, PDE4B has pleiotropic associations with intelligence 40, educational attainment 41, and sleep-related traits such as insomnia 42. The locus is also known for other disorders including major depression, stress disorders, schizophrenia, and multiple sclerosis 43 (putative comorbidities of AD 44,45), among other traits. The loci harbouring SEMA3F and BRINP3 have similarly been reported for intelligence (SEMA3F 46), general cognitive ability (SEMA3F 40), educational attainment (SEMA3F 47, BRINP3 41), insomnia (SEMA3F and BRINP3 42) and BMI (SEMA3F and BRINP3). Sex hormone-binding globulin levels 48 and multi-site chronic pain are some of the traits that have also been linked with SEMA3F. Interestingly, BMI, cognitive traits such as intelligence, cognitive performance and even sleep-related traits have been associated with GERD 31. Taken together, and in further support of their relationship, this observation suggests that GERD may share genetic links with certain AD-related phenotypes including cognitive and sleep-related traits.
Further, our analysis consistently identified and replicated the 19q13.32 locus (mapped genes: TOMM40, APOC2, KLC3, ERCC2, BCL3, and CD33) as shared by AD and GIT disorders. While this locus is well known for AD, it has also been linked with GIT traits including IBD 49 (SYMPK, lead SNP: rs16980051, GRCh37: 19:46,345,886) and gut microbiota 50, thus highlighting an association of AD with not only GIT disorders but also the gut microbiome. This premise is important given previous evidence of genetic links between dysbiosis, neurological (AD, for instance) and GIT disorders 15,22,51,52, and may underscore the need for a renewed focus on the genetics of the gut-brain connection (including the gut microbiome) to better understand the underlying mechanisms of AD. Similar to other identified loci, the 19q13.32 locus also displays pleiotropic association with many AD-related phenotypes: intelligence 53, cognitive impairment test score 54, t-tau and beta-amyloid 1-42 measurements, hippocampal atrophy rate, memory performance, and educational attainment 41.
Results of causal association analysis between AD and GIT disorders. We assessed the potential causal relationship between AD (as the outcome variable) and GERD (as the exposure variable) using the two-sample MR method. We found no evidence of a causal relationship between AD and GERD, irrespective of the direction of the analysis (AD or GERD as the outcome or exposure variable) [Table 5]. For sensitivity testing, we implemented three additional models of MR analysis: MR-Egger, weighted median, and MR-PRESSO (Mendelian Randomization Pleiotropy RESidual Sum and Outlier). Results from these methods agree with those of the Inverse Variance Weighted (IVW) model, supporting a lack of evidence for a causal association between AD and GERD (Table 5 and Supplementary Data 19). We carried out further MR analysis assessing AD against each of PUD, PGM, IBS, diverticular disease, and IBD, and vice versa.
Findings similarly reveal no evidence for a causal relationship between AD and each of the GIT disorders assessed (Supplementary Data 19). We also used the Latent Causal Variable (LCV) approach 29 to test for a causal relationship between AD and each of the GIT disorders. The results of LCV suggest a partial causal influence of gastritis-duodenitis (genetic causal proportion [GCP] = −0.69, P = 0.0026) on AD (Table 6). The result was in the reverse direction for diverticular disease (GCP = 0.23, P = 0.000272), suggesting AD may partially cause diverticular disease. Using another set of GWAS (Table 6), we tested the reproducibility of the partial causal association results for gastritis-duodenitis and diverticular disease; neither was reproduced, hence the need for these findings to be further assessed in future studies. […] Conversely, we found a significant association between AD and lansoprazole use (GCP = −0.38, P = 0.001129).
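As an illustration of the IVW model named above, the sketch below computes a two-sample MR estimate from per-variant associations with the exposure and the outcome. It uses the standard fixed-effect IVW formula under the usual instrument assumptions; the function name and the toy association values are hypothetical and do not come from the study data.

```python
from math import erfc, sqrt


def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate.

    Each list holds one value per genetic instrument: the SNP-exposure
    effect, the SNP-outcome effect, and the standard error of the
    SNP-outcome effect.
    """
    num = sum(
        bx * by / s ** 2
        for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome)
    )
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_exposure, se_outcome))
    beta = num / den                      # causal effect estimate
    se = sqrt(1.0 / den)                  # fixed-effect standard error
    p = erfc(abs(beta / se) / sqrt(2))    # two-sided P value
    return beta, se, p


# Toy instruments (made-up values): every Wald ratio equals 0.1.
bx = [0.10, 0.20, 0.15]
by = [0.010, 0.020, 0.015]
se_by = [0.01, 0.01, 0.01]

beta_ivw, se_ivw, p_ivw = ivw_mr(bx, by, se_by)
print(beta_ivw, se_ivw, p_ivw)
```

Sensitivity models such as MR-Egger and the weighted median relax the assumption that every instrument is valid, which is why agreement across models strengthens a null result.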
Gene-based association analysis. Using SNPs that overlapped AD and GERD GWAS, we performed gene-based analyses in MAGMA (implemented in the FUMA 55 […] Data 24). We also replicated a similar pattern of findings in gene-based analysis (and FCP) using the AD and the PGM GWAS (Table 7 and Supplementary Data 25).
Biological pathways and mechanisms shared by AD and GIT disorders. We performed pathway-based functional enrichment analyses in the g:Profiler platform 56 to functionally interpret genes overlapping AD and GIT disorders and gain biological insight from their commonalities. First, we investigated genes overlapping AD and GERD (at P_gene < 0.05, FCP < 0.02) and identified several biological pathways that were overrepresented (Fig. 2 and Supplementary Data 26), implying they have a role in the mechanisms underlying both AD and GERD. Pathways related to membrane trafficking and to the metabolism, alteration, lowering or inhibition of lipids were significantly enriched (Supplementary Data 26). These included plasma lipoprotein assembly, remodelling, and clearance (P_adjusted = 2.01 × 10^-3), cholesterol metabolism (P_adjusted = 4.99 × 10^-2), plasma lipoprotein assembly (P_adjusted = 3.45 × 10^-5), and triglyceride-rich plasma lipoprotein particle (P_adjusted = 5.23 × 10^-9), among others. Also, lipase inhibitors (P_adjusted = 6.08 × 10^-3) and the statin (3-hydroxy-3-methylglutaryl-coenzyme A reductase inhibitors) pathway (P_adjusted = 3.99 × 10^-2) were significantly enriched for AD and GERD (Supplementary Data 27), suggesting mechanisms of these medications may find therapeutic application in AD and GIT disorders. Pathways related to the immune system were also overrepresented for both AD and GERD as evidenced by the identification of immune or autoimmune-related disorders such […]
Following enrichment mapping and auto-annotation, the identified biological pathways were clustered into six themes of biological mechanisms, namely: 'lipoprotein particle clearance,' 'receptor signalling pathway,' 'side membrane vesicle and cell adhesion,' 'peptide antigen binding,' 'intestinal immune network,' and 'interferon-gamma signalling' (Fig. 2).
Moreover, a pathway-based analysis using genes that overlapped AD and PGM GWAS (at P_gene < 0.05) replicated some of the pathways identified for AD and GERD, including 'plasma lipoprotein assembly, remodelling, and clearance' (P_adjusted = 3.01 × 10^-4), 'peptide antigen binding' (P_adjusted = 2.28 × 10^-3), and 'triglyceride-rich plasma lipoprotein particle' (P_adjusted = 6.60 × 10^-8) [Supplementary Data 27]. Also, we performed pathway-based analysis separately for GERD and AD GWAS, the full results of which are presented in Supplementary Data 28 and 29, respectively.

Discussion
We present the first comprehensive assessment (to the best of our knowledge) of the shared genetics of AD with GIT disorders by analysing large-scale GWAS summary data using multiple statistical genetic approaches. Consistent with previous conventional observational studies [3][4][5][6][7][8][9], our findings confirm a risk-increasing relationship between AD and GIT disorders and provide insights into their underlying biological mechanisms.
Fig. 2 caption (fragment). […] membrane, clathrin-coated endocytic vesicle membrane, late endosome, ER to Golgi transport vesicle membrane, coated vesicle membrane, lumenal side of ER membrane, MHC protein complex, COPII-coated ER to Golgi transport vesicle, transport vesicle membrane, late endosome membrane), and plasma lipoprotein particle (chylomicron, very low-density lipoprotein [VLDL] particle, triglyceride-rich plasma lipoprotein particle, plasma lipoprotein particle, lipoprotein particle, LDL lipoprotein particle). c Gene Ontology: Molecular Function: peptide antigen binding (peptide binding, peptide antigen binding, MHC class II receptor activity) and lipase inhibitor activity (lipase inhibitor activity). d Gene Ontology: Biological Pathway: lipoprotein particle clearance (phospholipid efflux, VLDL particle clearance, regulation of plasma lipoprotein particle levels, plasma lipoprotein particle clearance, chylomicron remnant clearance, regulation of lipid catabolic process, regulation of VLDL particle clearance, protein-lipid complex assembly, plasma lipoprotein particle organisation, regulation of phospholipid catabolic process, VLDL particle assembly, regulation of lipid localisation, glycolipid catabolic process, triglyceride-rich lipoprotein particle clearance, high density lipoprotein particle remodelling), receptor signalling pathway (T cell receptor signalling pathway, interferon-gamma-mediated signalling pathway, antigen receptor-mediated signalling pathway), membrane adhesion cell (cell-cell adhesion via plasma membrane adhesion molecules, homophilic cell adhesion via plasma membrane adhesion molecules), and negative regulation type (negative regulation of type I interferon production). e Reactome, Wiki pathway and Transcription Factor Binding site: assembly clearance plasma (statin pathway, NR1H2 and NR1H3-mediated signalling, plasma lipoprotein assembly, remodelling, and clearance, plasma lipoprotein clearance, NR1H3 and NR1H2 regulated gene expression linked to cholesterol transport and efflux, VLDL assembly, VLDL clearance, plasma lipoprotein assembly), interferon-gamma signalling (PD-1 signalling, generation of second messenger molecules, interferon-gamma signalling, phosphorylation of CD3 and TCR ZETA chains, translocation of ZAP-70 to Immunological synapse), Factor: ZNF2 motif, and ZNF582 motif. Supplementary Data 26 provides additional details about these biological pathways. AD Alzheimer's disease, GERD gastroesophageal reflux disease.
In contrast to the positive genetic correlation between AD and other GIT disorders, LDSC found no significant genetic correlation between AD and IBD, which may be due to the relatively small number of cases and sample size of the IBD GWAS. Based on the effective sample size estimates, the IBD GWAS is underpowered compared to other GIT data sets. Supporting this premise, SECA revealed a significant association between AD (as data set 1) against IBD (as data set 2), but not the other way around. The AD GWAS has a larger sample size, providing a more robust association on which to condition (select independent) SNPs for concordance analysis which may explain why the significant association was not bidirectional unlike the case for other GIT traits. Future studies, nonetheless, need to confirm this relationship, as more powerful IBD GWAS becomes available.
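The effective sample size referred to above is commonly computed for a case-control GWAS as 4 / (1/N_cases + 1/N_controls). The sketch below uses that widely used formula; whether the study applied exactly this definition is an assumption, and the case and control counts are made-up values chosen only to show how an unbalanced design lowers the effective sample size.

```python
def effective_sample_size(n_cases, n_controls):
    """Effective sample size of a case-control study.

    Algebraically equal to 4 / (1/n_cases + 1/n_controls); written as a
    single ratio to avoid accumulating rounding error.
    """
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("case and control counts must be positive")
    return 4.0 * n_cases * n_controls / (n_cases + n_controls)


# Two hypothetical designs with the same total N of 100,000.
balanced = effective_sample_size(50_000, 50_000)    # equal cases and controls
unbalanced = effective_sample_size(10_000, 90_000)  # few cases, many controls
print(balanced, unbalanced)
```

With the same total sample, the design with fewer cases has a much smaller effective sample size, which is the sense in which a GWAS with relatively few cases can be underpowered.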
Evidence of significant genetic overlap and correlation reflects not only shared genetic aetiologies (biological pleiotropy) but also suggests a possible causal association between AD and the GIT traits (vertical pleiotropy). Using LCV, we detected a partial causal association between AD and gastritis-duodenitis, lansoprazole, and diverticular disease. However, this partial causal association was not evident in reproducibility testing. The inconclusive LCV findings should be cautiously interpreted, and a reassessment of the results, in future studies, is warranted. Conversely, all MR analyses provided no evidence for a significant causal relationship between AD and GIT traits, indicating that shared genetics and common biological pathways may best explain the association between AD and these GIT disorders.
We identified biological pathways, significantly enriched for genes overlapping AD and GIT disorder (GERD, and PUD) GWAS in pathway-based analyses. Notably, lipid-related, and autoimmune pathways were overrepresented. There is a close link between autoimmunity and lipid abnormalities 64 , and consistent with previous studies [65][66][67][68][69] , our findings highlight the importance of lipids homoeostasis in AD and GIT traits. In AD, for example, hypercholesterolaemia is believed to increase the permeability of the blood-brain barrier system, facilitating the entry of peripheral cholesterol into the CNS, and resulting in abnormal cholesterol metabolism in the brain 65,66 . Amyloidogenesis, alteration of the amyloid precursor protein degradation, accumulation of Aβ, and subsequent cognitive impairment have all been linked with elevated cholesterol in the brain 66,[70][71][72] . Similarly, while the exact roles of lipids in GIT disorders are unclear, H. pylori is believed to cause or worsen abnormal serum lipid profiles through chronic inflammatory processes, and eradication of the infection enhances lipid homoeostasis 68,69 .
The mechanisms of association between AD and lipid dysregulation relate to the 'gut-brain axis', alterations in GIT microbiota and the immune system 10,66. Moreover, lipid dysregulation is central to the interplay of AD, gut microbiota, and GIT disorders 10,66, suggesting the therapeutic potential of lipid-lowering medications such as lipase inhibitors and statins (identified in our study) in AD and GIT disorders. Lipase inhibitors (orlistat) prevent intestinal dietary lipid absorption and lower total plasma triglycerides and cholesterol levels 73,74, making them a preferred pharmacological treatment for obesity 73. The connection between AD, lipid dysregulation, dysbiosis and the 'gut-brain axis' 10,66 may thus support the potential utility of lipase inhibitors in AD. Lipases, including monoacylglycerol, diacylglycerol, and lipoprotein lipases, are involved in AD pathology and can also effectively be inhibited by orlistat 74. Similarly, statins possess anti-inflammatory, immune-modulating and gastroprotective properties 75,76, and their active use significantly reduced PUD risk 76 as well as enhanced H. pylori eradication 77. Statins also improve cognitive ability and reduce neurodegeneration risks, making them potentially beneficial in AD 78,79. However, there is evidence suggesting a paradoxical predisposition to reversible dementia with statins 78,79. While this finding has been challenged 78, it may highlight a need to identify AD patients for whom statins will be beneficial, consistent with the model of personalised health.
Our findings have implications for practice and further studies. First, results highlighting lipid-related mechanisms support the roles of abnormal lipid profiles in the aetiologies of the disorders, which may be potential biomarkers for AD and GIT disorders (or their comorbidity). Second, our findings underscore the importance of lipid homoeostasis. The dietary approach is one effective preventive as well as non-pharmacologic approach for the management of hyperlipidaemia, and overall, this is consistent with findings in this study. Indeed, adherence to a 'Mediterranean' diet (low in lipids) is recognised as beneficial both in AD 80 and GIT disorders 81 . Thus, a recommendation for healthy diets, early in life, may form part of the lifestyle modifications for preventing AD and GIT disorders. The clinical utility of these recommendations will need to be further investigated and validated. Third, our study identifies lipase inhibitors and statin pathways in the mechanisms of AD and GIT disorders, which may be a potential therapeutic avenue to explore in the disorders. We hypothesise that individuals with comorbid AD and GIT traits may gain benefits from these therapies. There is a need to test this hypothesis using appropriate study designs including randomised control trials. Fourth, our study implicates the PDE4B, and given the evidence in the literature 58-61 , we propose that treatment targeted at its inhibition may be promising in comorbid AD and GIT traits. Lastly, while our findings do not necessarily indicate that AD and GIT disorders will always co-occur, they support their shared biology; thus, early detection of AD may benefit from probing impaired cognition in GIT disorders.
The use of multiple, complementary statistical genetic approaches enables a comprehensive analysis of the genetic associations between AD and GIT disorders and is a major strength of this study. Also, we analysed well-powered GWAS data, meaning our findings are generally not affected by small sample size, possible reverse causality, or confounders that conventional observational studies often suffer from. Nonetheless, our study has limitations that should be considered alongside the present findings. First, the GWAS for AD combined clinically diagnosed cases of AD with proxies (AD-by-proxy: individuals whose parents were diagnosed with AD). Given the high correlation between the GWAS with and without the 'AD-by-proxy' cases 21 , we argue, as did others 21 , that combining them is valid, especially for sample size improvement, which is critical to ensuring adequately powered GWAS analysis. Second, analyses were restricted to participants of mainly European ancestry in our study; thus, findings may not be generalisable to other ancestries. Third, GIT traits GWAS were combinations of several data sources: primary care, hospital admission, medication use, and self-reported records. While there is a potential for misdiagnosis or inaccuracy of self-reported data, their use is well justified given the correlation in effect sizes of the data with other sources 22 . Moreover, additional data from other sources including ICD-10 were utilised with consistent results across these GWAS.
In conclusion, this study provides genetic insights into the long-standing debate and the observed relationship of AD with GIT disorders, implicating shared genetic susceptibility. Our findings support a significant risk-increasing (but non-causal) genetic association between AD and GIT traits (GERD, PUD, PGM, gastritis-duodenitis, IBS, and diverticular disease). Also, we identified genomic regions and genes shared by AD and GIT disorders that may potentially be targeted for further investigation, particularly the PDE4B gene (or its subtypes), which has shown promise in inflammatory diseases 57-60 . Our study also underscores the importance of lipid homoeostasis and the potential relevance of statins and lipase inhibitors in AD, GIT disorders or their comorbidity. To our knowledge, this is the first comprehensive study to assess these relationships using statistical genetic approaches. Overall, these findings advance our understanding of the genetic architecture of AD, GIT disorders, and their observed co-occurring relationship.
Methods
GWAS summary statistics. The GWAS data utilised in the present study are summarised in Table 1 with further cohort-specific details, including effective sample size estimates, provided in Supplementary Data 1. The data were sourced from popular GWAS databases, repositories, and large research consortia/groups. The GWAS summary data for 'clinically diagnosed AD and AD-by-proxy' 21 Data 1). Clinically, PUD medications are indicated in GERD and gastritis; accordingly, GWAS combining diagnosis for PUD and/or GERD and/or medications commonly used for these disorders (PGM) have been conducted 22 , potentially identifying people with PUD or GERD. This GWAS has a large sample size (cases = 90,175, controls = 366,152, N = 456,327), and as was the case in the original publication 22 , we utilised the data for analysis in the present study, as a proxy for PUD or GERD. These GIT GWAS were well characterised and, where possible, validated as described in the original publication 22 .
Additionally, we utilised a well-characterised GWAS for GERD (cases = 71,522, controls = 261,079, N = 332,601), which combined data sets from the UK Biobank and the QSKIN study 23 . Gastritis-duodenitis (cases = 28,941, controls = 378,124, N = 407,065) and diverticular disease (cases = 27,311, controls = 334,783, N = 362,094) GWAS from the Lee Lab (https://www.leelabsg.org/resources) were also used in this study. We utilised additional (available) GWAS summary data (Table 1 and Supplementary Data 1), sourced from public repositories, for possible replication of our genetic overlap and correlation (LDSC and SECA) findings. A comprehensive description of the quality control procedures for each of the GWAS data and their analysis is available through the corresponding publications (Table 1 and Supplementary Data 1). Our preliminary analysis indicates that there is no significant sample overlap between the AD GWAS and each of the GIT GWAS assessed in this study (Supplementary Data 2), which argues against bias from such overlap.
Linkage disequilibrium score regression analysis (LDSC). We assessed and quantified SNP-level genetic correlation between AD and GIT disorders using the LDSC 25 analysis method (https://github.com/bulik/ldsc/wiki/Heritability-and-Genetic-Correlation). LDSC assesses and distinguishes the contributions of polygenicity, sample overlaps, and population stratification to the heritability and genetic correlation between traits 25 . In the present study, we performed LDSC analysis using the standalone version of the software and by following the procedures provided by the program developer (https://github.com/bulik/ldsc). The apolipoprotein E (APOE) region has a large effect on the risk of AD; hence, we excluded APOE and the 500 kilobase (kb) flanking region (hg19, 19:44,909,039-45,912,650) from the AD GWAS for this analysis. We also excluded SNPs in the 26-36 megabase region of chromosome six from the data given the complex LD structure in the human major histocompatibility complex (MHC). To assess possible sample overlap between AD GWAS and each of the GIT GWAS, we performed LDSC correlation analysis with the genetic covariance intercept unconstrained. The result of this analysis indicates that the estimated genetic covariance intercepts were not significantly different from zero (Supplementary Data 2), indicating no significant sample overlap between our AD and GIT GWAS. Thus, we constrained the intercept in the reported genetic correlation analysis. We applied Bonferroni adjustment for testing the effects of seven GIT traits on AD (0.05/7 = 7.1 × 10−3), and all genetic correlation results surviving this adjustment were considered significant while those having P < 0.05 were regarded as nominally significant.
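As an illustration of the region exclusion and significance thresholds described above, the following minimal Python sketch filters SNPs falling in the APOE and MHC regions and classifies a P value against the Bonferroni threshold for seven tests. The region coordinates come from the text; the (identifier, chromosome, position) record layout, the SNP labels and all function names are assumptions made for illustration and are not part of LDSC.

```python
# Illustrative sketch only. Region bounds are taken from the text (hg19);
# the record layout and function names are hypothetical.
APOE_REGION = ("19", 44_909_039, 45_912_650)  # APOE plus 500 kb flanks
MHC_REGION = ("6", 26_000_000, 36_000_000)    # 26-36 Mb on chromosome 6
EXCLUDED_REGIONS = (APOE_REGION, MHC_REGION)


def in_excluded_region(chrom, pos):
    """Return True if (chrom, pos) lies inside any excluded region."""
    return any(
        str(chrom) == region_chrom and start <= pos <= end
        for region_chrom, start, end in EXCLUDED_REGIONS
    )


def filter_snps(snps):
    """Keep only SNPs outside the excluded regions.

    `snps` is an iterable of (snp_id, chrom, pos) tuples (assumed layout).
    """
    return [snp for snp in snps if not in_excluded_region(snp[1], snp[2])]


def classify_p_value(p, n_tests=7, alpha=0.05):
    """Label a P value using the Bonferroni threshold alpha / n_tests."""
    if p < alpha / n_tests:
        return "significant"
    if p < alpha:
        return "nominally significant"
    return "not significant"


# Hypothetical SNP records used only to exercise the functions.
example = [
    ("snp_a", 19, 45_411_941),  # inside the APOE window, dropped
    ("snp_b", 6, 30_000_000),   # inside the MHC window, dropped
    ("snp_c", 1, 1_000_000),    # kept
]
print(filter_snps(example))
print(classify_p_value(0.004))
```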
SNP effect concordance analysis (SECA). We used the standalone version of the SECA software pipeline to perform SNP-level genetic overlap assessment and statistical tests between AD and GIT disorders. A detailed description of the SECA software and methods has been published 26 . Briefly, SECA accepts a pair of GWAS data (data set 1 and data set 2) as input and performs a range of analyses to assess concordance in effect direction between a pair of traits (AD and GIT disorders in the present study). First, we carried out quality control to exclude all non-rsID(s) and duplicate variants in data set 1 and align SNP effects to the same effect allele across data set 1 and data set 2. Second, we performed two rounds of P-value informed LD clumping in data set 1 (first clumping: --clump-r2 0.1, --clump-kb 1000; second clumping: --clump-r2 0.1, --clump-kb 10000) using PLINK 1.90 30 .
Third, SECA partitions independent SNPs resulting from LD clumping into 12 subsets of SNPs according to the P value for data set 1 as follows: P1 ≤ (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0). SECA subsequently performs Fisher's exact tests to assess the presence of excess SNPs in which the direction of effects is concordant across data set 1 and data set 2 (that is, for the corresponding P value derived 12 subsets of SNPs associated in data set 2, P2). Hence, a total of 144 SNP subsets (a 12 by 12 matrix from data set 1 and data set 2) were assessed for SNP effect concordance. SECA calculates permuted P value for the number of significant associations with adjustment for testing 144 associations (based on permutations of 1000 replicates).
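The 12 by 12 concordance grid can be sketched as follows. This is a simplified illustration, not the SECA implementation: it assumes each test is a one-sided Fisher's exact test on a 2 × 2 table of effect directions (positive or non-positive in data set 1 against data set 2), and it omits the permutation step. The tuple layout and function names are hypothetical.

```python
from math import comb

# P-value thresholds used to define the 12 SNP subsets in each data set.
THRESHOLDS = (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing `a` or more in the
    top-left cell, i.e. evidence for an excess of concordant effects.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    tail = sum(
        comb(row1, k) * comb(row2, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / total


def concordance_grid(snps):
    """Test effect-direction concordance for all 144 threshold pairs.

    `snps` is a list of (p1, beta1, p2, beta2) tuples for independent SNPs
    (assumed layout). Returns {(t1, t2): one-sided Fisher P value}.
    """
    grid = {}
    for t1 in THRESHOLDS:
        for t2 in THRESHOLDS:
            a = b = c = d = 0
            for p1, beta1, p2, beta2 in snps:
                if p1 > t1 or p2 > t2:
                    continue
                if beta1 > 0 and beta2 > 0:
                    a += 1  # both effects positive
                elif beta1 > 0:
                    b += 1  # positive in data set 1 only
                elif beta2 > 0:
                    c += 1  # positive in data set 2 only
                else:
                    d += 1  # both effects non-positive
            grid[(t1, t2)] = fisher_exact_greater(a, b, c, d)
    return grid


# Hypothetical SNP summary values used only to exercise the functions.
toy_snps = [(0.005, 0.2, 0.03, 0.1), (0.04, -0.1, 0.2, -0.3), (0.3, 0.1, 0.6, -0.2)]
grid = concordance_grid(toy_snps)
print(len(grid), sum(p < 0.05 for p in grid.values()))
```

In SECA itself, the number of nominally significant cells is then compared against a permutation distribution; that step is left out here for brevity.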
In the present study, we first assessed AD GWAS as data set 1 and each of the GIT disorders as data set 2. For comparison, we also assessed each of the GIT disorders as data set 1 against AD as data set 2. Thus, using SECA, we assessed the effects of AD-associated SNPs on each of the GIT disorders and vice versa. Since SECA is conditioned on data set 1, the bi-directional assessment is an important analysis step to account for instances where SNPs that are strongly associated with AD do not affect GIT traits and vice versa. Further, the bi-directional analysis (which is not possible with LDSC, for example) enables the assessment of whether the observed genetic overlap is driven primarily by only one of the traits or by both, giving a better understanding of their association.
GWAS cross-traits meta-analysis. GWAS meta-analysis pools the results of GWAS data, thereby increasing the sample sizes and augmenting the detection of genetic variants with small to moderate effect sizes. In the present study, we used the GWAS meta-analysis method of pooling AD GWAS with each of the GIT traits (cross-disorder or cross-trait meta-analysis). We used two models of meta-analysis: the Fixed Effect (FE), and the modified Random Effect (RE2) 82 models. The FE model estimates the FE P-value using the inverse-variance weighted method, which assumes that the AD and each of the GIT disorders' GWAS are assessing the same (fixed) effect. The presence of effect heterogeneity is a limitation of the model. On the other hand, by estimating P-values using the modified random effects, the RE2 model 82 allows for differences in SNP effects and the method is powerful in the presence of SNP effect heterogeneity.
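For a single SNP, the inverse-variance weighted fixed-effect estimate described above can be sketched as below. This is a minimal illustration of the standard formula, not the meta-analysis software used in the study; the function name and the example effect sizes are hypothetical, and the RE2 model is not shown.

```python
from math import erfc, sqrt


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP.

    `betas` and `ses` hold the per-study effect sizes and standard errors
    (aligned to the same effect allele). Returns (beta, se, z, p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    weight_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / weight_sum
    se = sqrt(1.0 / weight_sum)
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided P value from the normal distribution
    return beta, se, z, p


# Hypothetical effect sizes for one SNP in an AD GWAS and a GIT GWAS.
beta, se, z, p = fixed_effect_meta([0.030, 0.025], [0.008, 0.006])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

Because the pooled estimate assumes one common effect, a SNP with opposite or very different effects in the two traits is poorly described by this model, which is the motivation for also reporting RE2 results.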
Genomic loci characterisation. Using the outputs of our cross-trait meta-analyses for AD and each of the GIT disorders, we carried out some downstream analyses including functional annotation of SNPs, and genomic loci characterisation in line with practice in the previous studies 13,55,83,84 . Briefly, SNPs that were not genome-wide significant in the individual AD and GIT disorder GWAS, but which reached genome-wide significance following the meta-analysis were identified. From these, we characterised independent SNPs at r2 < 0.6, and lead SNPs at r2 < 0.1. We defined the genomic locus as the region within 250 kb of each lead SNP. We assigned lead SNPs within this region to the same locus, meaning two or more lead SNPs may be present in one locus. We performed these downstream analyses using the Functional Mapping and Annotation (FUMA) software (an online platform) 55 . We subsequently queried identified loci in the GWAS catalogue (https://www.ebi.ac.uk/gwas) and Open Targets Genetics (https://genetics.opentargets.org) to assess their previous identification for AD, GIT disorders or other traits.
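The grouping of lead SNPs into loci can be illustrated with a short sketch. It is a simplification under stated assumptions: lead SNPs are given as (chromosome, position) pairs, and a lead SNP joins the current locus when it lies within 250 kb of the previous lead SNP on the same chromosome. FUMA's own procedure works on LD blocks and may differ in detail; the function name and example positions are hypothetical.

```python
def group_lead_snps(lead_snps, window=250_000):
    """Group lead SNPs into loci by physical distance.

    `lead_snps` is an iterable of (chrom, pos) tuples with integer
    chromosomes (assumed layout). Returns a list of loci, each a dict
    holding the chromosome and the sorted lead SNP positions.
    """
    loci = []
    for chrom, pos in sorted(lead_snps):
        current = loci[-1] if loci else None
        if (
            current is not None
            and current["chrom"] == chrom
            and pos - current["positions"][-1] <= window
        ):
            current["positions"].append(pos)  # same locus as previous lead SNP
        else:
            loci.append({"chrom": chrom, "positions": [pos]})  # start a new locus
    return loci


# Hypothetical lead SNP positions used only to exercise the function.
leads = [(1, 300_000), (1, 100_000), (1, 900_000), (2, 150_000)]
for locus in group_lead_snps(leads):
    print(locus)
```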
Pairwise GWAS analysis. We performed a co-localisation analysis utilising the pairwise GWAS (GWAS-PW) method 27 to further assess the regions in the genome shared by AD and GIT disorders. Briefly, GWAS-PW software implements the Bayesian pleiotropy association test and identifies genomic regions that influence a pair of correlated traits 27 . We used this method to assess whether the loci reaching genome-wide significance in our GWAS meta-analyses were truly shared by AD and the GIT disorders. Also, we investigated other shared genomic regions which may not have been found in the GWAS meta-analysis. We combined the summary data for AD with the data for each of the GIT disorders and estimated the posterior probability of association (PPA) of a genomic region using the GWAS-PW software. We modelled four PPAs: (i) that a genomic region is associated with AD only (PPA-1), (ii) that a genomic region is associated with the GIT trait only (PPA-2), (iii) that a genomic region is associated with both AD and the GIT trait and the causal variant is the same (PPA-3), and (iv) that a specific genomic region is associated with both AD and the GIT trait but through separate causal variants (PPA-4) 27 .
Causal relationship assessment. Using MR 28 analysis methods, we assessed the causal association between AD and each of the GIT disorders in this study. Mimicking randomised controlled trials (RCTs), MR analysis incorporates genetics into epidemiological study designs to assess causality 28 . The method is based on the principle of instrumental variables and underpinned by three primary assumptions. First is the relevance assumption, which requires that the chosen instruments are robustly associated with the exposure variable 85 . Second is the independence assumption, which states that the instruments must not be associated with confounders of the exposure-outcome variables 85 . Last is the assumption of exclusion, which demands that the instruments influence the outcome only through their relationship with the exposure variable 85 .
In the present study, we used the two-sample MR method (https://mrcieu.github.io/TwoSampleMR/articles/introduction.html) for a bidirectional association assessment between AD and each of the GIT disorders. In the first round of analysis (AD as exposure variable), independent (r2 < 0.001) genome-wide significant SNPs (P < 5 × 10−8) associated with AD were utilised as instrumental variables (IVs) and assessed against each of the GIT disorders' GWAS (outcome variables) analysed in this study. This analysis assesses whether genetic predisposition to AD is causally associated with any of the GIT traits included in the present study. Reversing the direction of analysis, independent SNPs robustly associated with each of the GIT disorders' GWAS (exposure variables) were similarly utilised as IVs and assessed against AD (as the outcome variable). In this instance, we assessed the potential causal effects of GIT traits on AD.
We used the inverse variance weighted (IVW) model of MR as the primary method for causal association assessment, and for validity testing, we performed a heterogeneity test (Cochran's Q-test), a 'leave-one-out' analysis, a horizontal pleiotropy check (MR-Egger intercept) and individual SNP MR analyses. Also, we used other MR analysis models including the MR-Egger, weighted median 86,87 , and the 'Mendelian randomisation pleiotropy residual sum and outlier' (MR-PRESSO) 88 methods for sensitivity testing. The MR-Egger and weighted median models operate under weaker assumptions of MR and are designed to provide valid causal estimates even when horizontal pleiotropy is present in all (MR-Egger) or up to 50% (weighted median) of selected IVs 86,87 . The MR-PRESSO method, in turn, can detect and correct for horizontal pleiotropy by excluding outlier IVs, thereby improving the validity of causal estimates 88 . All MR analyses were performed in R (4.0.2).
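The IVW estimate and Cochran's Q statistic can be sketched from per-SNP summary statistics as follows. This is a minimal Python illustration of the textbook formulas (Wald ratios combined with first-order inverse-variance weights and a fixed-effect standard error); it is not the R code used in the study, whose standard errors may be computed differently. The function name and the example values are hypothetical.

```python
from math import sqrt


def ivw_mr(beta_exp, beta_out, se_out):
    """Inverse variance weighted MR estimate with Cochran's Q.

    Each list holds one value per instrument: the SNP-exposure effect, the
    SNP-outcome effect, and the standard error of the SNP-outcome effect.
    Returns a dict with the causal estimate, its fixed-effect standard
    error, Cochran's Q and the Q degrees of freedom.
    """
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]       # Wald ratios
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out)]
    weight_sum = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / weight_sum
    se = sqrt(1.0 / weight_sum)
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return {"estimate": estimate, "se": se, "q": q, "q_df": len(ratios) - 1}


# Hypothetical instrument effects used only to exercise the function.
result = ivw_mr(
    beta_exp=[0.10, 0.20, 0.15],
    beta_out=[0.02, 0.01, 0.03],
    se_out=[0.01, 0.01, 0.01],
)
print(result)
```

A large Q relative to its degrees of freedom points to heterogeneity among the per-SNP ratio estimates, which is one reason the pleiotropy-robust methods listed above are run alongside IVW.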
We performed an additional assessment of the causal or partial causal association between AD and each of the GIT disorders using the Latent Causal Variable (LCV) method 29 . LCV estimates the genetic causality proportion (GCP), which ranges from −1 to 1: a value close to 1 indicates a potential causal association between two traits in the forward direction, and a value close to −1 indicates one in the backward direction 29 . LCV corrects for heritability and genetic correlation between traits and is not limited by sample overlap 29 . This analysis was performed in the online platform of the Genetics of Complex Traits (CTG) virtual laboratory (https://vl.genoma.io/analyses/lcv) 29,89 .
Gene-based association analysis. We performed gene-based association analyses to identify genome-wide significant genes shared by both AD and each of the GIT disorders assessed in this study. This analysis complements the SNP-based studies. However, beyond the SNP level, gene-based association analysis provides greater power for identifying genetic risk variants since it aggregates the effects of multiple SNPs, and it is generally not limited by small effect sizes or correlations among SNPs. Moreover, genes are more closely related to biology than SNPs, meaning gene-level analysis can provide better insights into the underlying biological mechanisms of complex traits.
In the present study, we carried out gene-based association analysis separately for AD and GERD using the multi-marker analysis of genomic annotation (MAGMA) software, implemented in the FUMA (https://fuma.ctglab.nl/) 55 platform. We set gene boundaries to the gene itself, with a window of ±0 kb outside the gene, and to ensure that equivalent gene-based tests were performed, we analysed each trait separately using only the SNPs overlapping the AD and GERD GWAS. Following a similar procedure, we also performed gene-based analysis using SNPs overlapping AD and PGM GWAS.
Based on the results of the gene-based analysis, we identified genome-wide significant genes for each of the traits (AD, GERD and PGM) at an adjusted P value of 2.64 × 10−6 (0.05/18,929: Bonferroni adjustment for testing 18,929 genes). Further, to identify genes shared by AD and each of GERD and PGM, we extracted their overlapping genes at gene P value <0.1 (Pgene < 0.1). We combined the respective P values for AD and the GIT traits using Fisher's Combined P-value (FCP) method and thereafter identified shared genes reaching genome-wide significance for AD and each of GERD and PGM in the FCP analyses.
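The gene-level filtering and combination steps can be sketched as below. The sketch uses the standard form of Fisher's method (the statistic −2 Σ ln p follows a chi-square distribution with 2k degrees of freedom, whose tail probability has a closed form for even degrees of freedom) together with the thresholds stated in the text. The gene labels, P values and function names are hypothetical.

```python
from math import exp, log

GENE_THRESHOLD = 0.05 / 18_929  # Bonferroni threshold for 18,929 genes


def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    Uses the closed-form chi-square tail for 2k degrees of freedom:
    exp(-x/2) * sum_{i<k} (x/2)^i / i!, where x = -2 * sum(ln p).
    """
    half_stat = -sum(log(p) for p in p_values)
    term, total = 1.0, 0.0
    for i in range(len(p_values)):
        total += term
        term *= half_stat / (i + 1)
    return exp(-half_stat) * total


def shared_genes(p_ad, p_git, gene_p_cutoff=0.1, threshold=GENE_THRESHOLD):
    """Return {gene: FCP} for genes passing the cutoff in both traits
    and reaching the Bonferroni threshold after combination."""
    shared = {}
    for gene in sorted(set(p_ad) & set(p_git)):
        if p_ad[gene] < gene_p_cutoff and p_git[gene] < gene_p_cutoff:
            fcp = fisher_combined_p([p_ad[gene], p_git[gene]])
            if fcp < threshold:
                shared[gene] = fcp
    return shared


# Hypothetical gene-based P values used only to exercise the functions.
p_ad = {"GENE_A": 1e-4, "GENE_B": 0.5, "GENE_C": 0.01}
p_git = {"GENE_A": 1e-4, "GENE_B": 1e-9, "GENE_C": 0.05}
print(f"threshold={GENE_THRESHOLD:.2e}")
print(shared_genes(p_ad, p_git))
```

For two P values the combined value reduces to p1·p2·(1 − ln(p1·p2)), which makes it easy to check individual results by hand.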
Pathway-based functional enrichment analysis. For a better understanding of the potential biological mechanisms underlying AD and GIT disorders or their comorbidity, we carried out pathway-based functional enrichment analyses using the online platform of the g:GOst tool in the g:Profiler software 56 . The g:GOst tool performs analysis on the list of user-inputted genes and queries relevant databases including Gene Ontology, Human Protein Atlas, WikiPathways, Human Phenotype Ontology, CORUM, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome. This analysis enables us to functionally interpret genes overlapping AD and GIT disorders. We included genes that were overlapping between AD and each of GERD and PGM at Pgene < 0.05 (FCP < 0.02) in this analysis, and followed established protocols 90 . Functional category term sizes were restricted to values from 5 to 350 90 . For multiple testing corrections, we applied the default 'g:SCS algorithm' recommended in the protocol 90 and reported the significantly enriched biological pathways at the multiple testing adjusted P value (Padjusted) < 0.05.
Statistics and reproducibility. We performed statistical analysis mainly in the Unix environment and the R (https://www.r-project.org/) software. Additional software including Python (https://www.python.org/), PLINK (https://www.cog-genomics.org/plink/) and online platforms (CTG virtual lab: https://vl.genoma.io/updates, g:Profiler: https://biit.cs.ut.ee/gprofiler, and FUMA: https://fuma.ctglab.nl) were utilised. Adjustment for multiple testing was carried out using the Bonferroni approach in LDSC, gene-based and meta-analyses. In g:Profiler, we applied the recommended inbuilt 'g:SCS algorithm' for multiple testing corrections. To enable us to test the reproducibility of AD and GIT association, we used available GIT data for further analysis.
Ethics approval and consent to participate. This study is a secondary analysis of existing GWAS summary data from public repositories, and international research consortia. Specific and relevant ethics approval for each of the data utilised is presented in the associated publications described in the section for GWAS summary data. No additional ethics approval is required for the conduct of the present study.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All data generated during this study are included in the published article and its Supplementary section. GWAS summary statistics data analysed were sourced from international research consortia and public repositories as described in the subsection for GWAS summary data. The data are freely available and accessible online through the links and references provided within this study. Supplementary Data 1 provides a comprehensive description of the data and how to access them.